Variance reduction trends on 'boosted' classifiers
Author
Abstract
Ensemble classification techniques such as bagging (Breiman, 1996a), boosting (Freund & Schapire, 1997) and arcing algorithms (Breiman, 1997) have received much attention in the recent literature. Such techniques have been shown to reduce classification error on unseen cases. Even when the ensemble is trained well beyond zero training-set error, it continues to exhibit improved classification error on unseen cases. Despite many studies and conjectures, the reasons behind this improved performance, and an understanding of the underlying probabilistic structures, remain open and challenging problems. More recently, diagnostics such as the edge and the margin (Breiman, 1997; Freund & Schapire, 1997; Schapire et al., 1998) have been used to explain the improvements made when ensemble classifiers are built. This paper presents some interesting results from an empirical study performed on a set of representative datasets using the decision tree learner C4.5 (Quinlan, 1993). An exponential-like decay in the variance of the edge is observed as the number of boosting trials is increased; that is, boosting appears to 'homogenise' the edge. Some initial theory is presented which indicates that a lack of correlation between the errors of individual classifiers is a key factor in this variance reduction.
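To make the diagnostic concrete, the sketch below is a minimal illustration and not the paper's experiment: it uses scikit-learn's AdaBoost over decision stumps (standing in for C4.5) on synthetic data, and tracks the variance of the normalised margin across boosting rounds. For two-class problems the margin and the edge are related by margin = 1 - 2*edge, so Var(edge) = Var(margin)/4 and a decaying Var(margin) reflects the same homogenisation.

# Minimal sketch (assumes scikit-learn >= 1.2 for the `estimator` argument);
# AdaBoost over decision stumps stands in for the paper's C4.5 learner.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_signed = 2 * y - 1  # recode labels to {-1, +1}

clf = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=200,
    random_state=0,
).fit(X, y)

# votes[t, i] = +1 if the round-t stump classifies example i correctly, else -1
votes = np.array([(2 * est.predict(X) - 1) * y_signed for est in clf.estimators_])
w = clf.estimator_weights_

for t in (10, 50, 100, 200):
    # Normalised margin after t rounds; with edge = (1 - margin) / 2,
    # Var(edge) = Var(margin) / 4, so both shrink together.
    margin = (w[:t] @ votes[:t]) / w[:t].sum()
    print(f"after {t:3d} rounds: Var(margin) = {margin.var():.4f}")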
Similar resources
Improving 2D Boosted Classifiers Using Depth LDA Classifier for Robust Face Detection
Face detection plays an important role in Human-Robot Interaction. Many services provided by robots depend on face detection. This paper presents a novel face detection algorithm which uses depth data to improve the efficiency of a boosted classifier on 2D data by reducing false-positive alarms. The proposed method uses two levels of cascade classifiers. The classifiers of the first lev...
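As a rough illustration of the two-level cascade described here, the following is a hypothetical sketch; detect_2d_candidates and depth_lda_predict are illustrative placeholders, not the authors' actual components.

def detect_faces(rgb, depth, detect_2d_candidates, depth_lda_predict):
    """Two-level cascade: a boosted 2D detector proposes face windows, and a
    depth-based LDA classifier vetoes them to suppress false positives."""
    faces = []
    for (y0, y1, x0, x1) in detect_2d_candidates(rgb):   # level 1: 2D boosted classifier
        if depth_lda_predict(depth[y0:y1, x0:x1]):       # level 2: depth LDA confirmation
            faces.append((y0, y1, x0, x1))
    return faces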
Boosting Random Forests to Reduce Bias; One-Step Boosted Forest and its Variance Estimate
In this paper we propose using the principle of boosting to reduce the bias of a random forest prediction in the regression setting. From the original random forest fit we extract the residuals and then fit another random forest to these residuals. We call the sum of these two random forests a one-step boosted forest. We have shown with simulated and real data that the one-step boosted forest h...
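A minimal regression sketch of the procedure described above, assuming in-sample residuals (the authors may well use out-of-bag residuals instead):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf1 = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
residuals = y_tr - rf1.predict(X_tr)      # stage-one residuals (in-sample here)
rf2 = RandomForestRegressor(random_state=1).fit(X_tr, residuals)

# The one-step boosted forest is the sum of the two forests' predictions.
boosted = rf1.predict(X_te) + rf2.predict(X_te)
print("forest MSE: ", np.mean((y_te - rf1.predict(X_te)) ** 2))
print("boosted MSE:", np.mean((y_te - boosted) ** 2))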
Interpretable Boosted Naïve Bayes Classification
Voting methods such as boosting and bagging provide substantial improvements in classification performance in many problem domains. However, the resulting predictions can prove inscrutable to end-users. This is especially problematic in domains such as medicine, where end-user acceptance often depends on the ability of a classifier to explain its reasoning. Here we propose a variant of the boos...
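The article's interpretable variant is cut off above and is not reproduced here; as a generic baseline only, boosting a naive-Bayes learner can be sketched with AdaBoost over Gaussian naive Bayes:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, random_state=0)

# Boosted naive Bayes baseline; GaussianNB accepts the per-example
# sample weights that AdaBoost updates on each round.
boosted_nb = AdaBoostClassifier(estimator=GaussianNB(),
                                n_estimators=50,
                                random_state=0).fit(X, y)
print("training accuracy:", boosted_nb.score(X, y))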
Weighted Proportional k-Interval Discretization for Naive-Bayes Classifiers
The use of different discretization techniques can be expected to affect the classification bias and variance of naive-Bayes classifiers. We call such an effect discretization bias and variance. Proportional k-interval discretization (PKID) tunes discretization bias and variance by adjusting discretized interval size and number proportional to the number of training instances. Theoretical analys...
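A minimal sketch of plain PKID as described here (the weighted variant proposed by the article is not reproduced): both the number and the expected size of the equal-frequency intervals grow with the number of training instances, each roughly as sqrt(n).

import numpy as np

def pkid_discretize(values):
    """Equal-frequency binning with ~sqrt(n) intervals of ~sqrt(n) values
    each, so interval size and number both scale with the training size."""
    n = len(values)
    k = max(1, int(np.sqrt(n)))                      # number of intervals
    cuts = np.quantile(values, np.linspace(0, 1, k + 1)[1:-1])
    return np.searchsorted(cuts, values)             # interval index per value

x = np.random.default_rng(0).normal(size=400)
print(np.bincount(pkid_discretize(x)))  # ~20 intervals of ~20 values each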
Modeling Consensus: Classifier Combination for Word Sense Disambiguation
This paper demonstrates the substantial empirical success of classifier combination for the word sense disambiguation task. It investigates more than 10 classifier combination methods, including second-order classifier stacking, over 6 major structurally different base classifiers (enhanced Naïve Bayes, cosine, Bayes Ratio, decision lists, transformation-based learning and maximum variance boost...
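In the same spirit, a minimal sketch of combining structurally different base classifiers by majority vote; none of the article's specific combination methods, such as second-order stacking, is reproduced here.

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Hard majority vote over three structurally different base classifiers.
combo = VotingClassifier(estimators=[
    ("nb", GaussianNB()),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
], voting="hard").fit(X, y)
print("training accuracy:", combo.score(X, y))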
Journal title: JAMDS
Volume: 8, Issue: -
Pages: -
Publication date: 2004